8 May 2017

intro

acknowledgements

  • "when I say 'I' I mean 'we' …"
  • People
    • Guillaume Blanchet (Sherbrooke)
    • Anna Norberg (Univ of Helsinki)
    • Otso Ovaskainen (NTNU, Univ of Helsinki)
  • Institutions ($)
    • NSERC
    • NIMBioS

outline

  • data set descriptions
  • "bake-off" of species distribution models
    • esp. joint species distribution models

data sets

diatoms

  • 7 stream systems \(\times\) 15 samples
    (105 samples total)
  • 499 species
  • predictors: shading, % particles, moss cover, water velocity, conductance, pH, water colour, total P
  • use first 4 PCA axes

fungi

  • 29 forests
    (21 managed, 8 natural)
  • 22,500 samples (logs)
  • 321 species
  • predictors: management status, log(diameter), decay class, decay class\(^2\)


UK butterflies

  • 2841 10 km \(\times\) 10 km grid cells
  • 55 species
  • predictors: # growing days > 5 °C, % broadleaf woodland, % coniferous woodland, % calcareous substrates

small pearl-bordered fritillary (Boloria selene)


green hairstreak (Callophrys rubi)


holly blue (Celastrina argiolus)

Australian plants

  • 30,000 locations (600 used)
  • 2729 herbaceous species
  • predictors: salinity, wetness, rainfall

Dawsonia longiseta

Sclerodontium pallidum

Abutilon theophrasti

Epilobium hirsutum

US trees

  • 100,000 locations (600 used); 2000-2010
  • 590 species
  • predictors: first 3 PCA axes (out of 38 climate and soil variables)

Larch

Black spruce

White spruce

Aspen

models

Univariate (stacked SDMs)

  • fit each species independently
    • algorithmic: boosted regression trees, random forest, support vector machine, multivariate regression trees, MARS, gradient nearest neighbour …
    • model-based: GLM, GAM
  • community-level predictions simply aggregate individual-species results
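As a minimal sketch of the stacking idea above (not the code used in the bake-off), the example below fits a separate one-predictor logistic regression to each species and then sums the fitted occurrence probabilities to get expected richness per site. The data, the species names, and the helpers `fit_logistic` and `predict` are all made up for illustration; any of the univariate methods listed above could stand in for the logistic fit.

```python
import math

def fit_logistic(x, y, steps=2000, lr=0.5):
    """Fit P(y=1) = sigmoid(b0 + b1*x) by gradient ascent on the mean log-likelihood."""
    b0 = b1 = 0.0
    n = len(x)
    for _ in range(steps):
        g0 = g1 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            g0 += yi - p            # score for the intercept
            g1 += (yi - p) * xi     # score for the slope
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1

def predict(coef, x):
    """Predicted occurrence probability at each value of the predictor."""
    b0, b1 = coef
    return [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]

# Toy data (hypothetical): one environmental predictor at 8 sites,
# presence/absence for 3 invented species.
env = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
occ = {
    "sp_a": [0, 0, 0, 1, 0, 1, 1, 1],
    "sp_b": [1, 1, 0, 1, 0, 0, 1, 0],
    "sp_c": [0, 1, 0, 0, 1, 1, 0, 1],
}

# Step 1: fit each species independently (no information shared).
models = {sp: fit_logistic(env, y) for sp, y in occ.items()}

# Step 2: aggregate the per-species predictions into a community-level
# quantity, here the expected species richness at each site.
probs = {sp: predict(coef, env) for sp, coef in models.items()}
richness = [sum(probs[sp][i] for sp in probs) for i in range(len(env))]
```

Because each species is fitted on its own, any co-occurrence pattern not explained by the predictor is ignored in the stacked richness.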

Multivariate (joint SDMs)

  • fit all species simultaneously, allowing positive or negative (\(\pm\)) dependencies among species
    • species archetype models (SAM)
    • neural networks (MISTNET)
    • hierarchical GLM (HMSC)
  • potential improvement in accuracy
  • more direct community-level predictions
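One generic way to write down the "\(\pm\) dependencies" above is a latent-variable probit model. This is a sketch under assumed notation (sites \(i\), species \(j\), predictors \(\mathbf{x}_i\), a single site-level latent factor \(\eta_i\) with species loadings \(\lambda_j\)); it is not claimed to be the exact formulation of SAM, MISTNET or HMSC.

\[
z_{ij} = \mathbf{x}_i^\top \boldsymbol{\beta}_j + \eta_i \lambda_j + \varepsilon_{ij},
\qquad \eta_i \sim N(0,1), \quad \varepsilon_{ij} \sim N(0,1), \qquad
y_{ij} = \mathbb{1}[z_{ij} > 0]
\]

\[
\operatorname{Cov}(z_{ij}, z_{ik} \mid \mathbf{x}_i) = \lambda_j \lambda_k ,
\qquad
\operatorname{Cor}(z_{ij}, z_{ik} \mid \mathbf{x}_i) =
\frac{\lambda_j \lambda_k}{\sqrt{(1+\lambda_j^2)(1+\lambda_k^2)}}
\qquad (j \neq k)
\]

Loadings of the same sign give positive residual dependence between two species, opposite signs give negative dependence, and setting every \(\lambda_j = 0\) recovers independent per-species probit regressions, i.e. a stacked model.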

sampling

  • hierarchical (discrete) vs. spatially explicit
  • interpolation vs. extrapolation
  • some large data sets subsampled

assessment statistics

  • joint models could improve individual predictions …
  • … but expect more obvious effects on community-level response

results

summary (terciles)

summary
(species occupancy: Tjur \(R^2\))
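For reference, Tjur's \(R^2\) (the coefficient of discrimination) is the mean predicted probability at sites where the species is present minus the mean predicted probability where it is absent. A minimal sketch, with a hypothetical helper `tjur_r2` and made-up values:

```python
def tjur_r2(y, p):
    """Tjur's R^2: mean fitted probability for presences minus that for absences.

    y: observed 0/1 occurrences; p: predicted occurrence probabilities.
    Assumes the species has at least one presence and one absence.
    """
    present = [pi for yi, pi in zip(y, p) if yi == 1]
    absent = [pi for yi, pi in zip(y, p) if yi == 0]
    return sum(present) / len(present) - sum(absent) / len(absent)

# Made-up observations and predictions for one species at six sites.
y_obs = [1, 1, 0, 0, 1, 0]
p_hat = [0.9, 0.7, 0.2, 0.4, 0.6, 0.1]
r2 = tjur_r2(y_obs, p_hat)
```

Values near 1 mean presences and absences are cleanly separated; values near 0 mean the predictions barely discriminate.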

summary
(species occupancy: deviance)
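A sketch of the binomial (Bernoulli) deviance for presence/absence predictions, assuming the unscaled form \(-2 \sum \left[ y \log p + (1-y) \log(1-p) \right]\); whether the summary used this or a scaled variant is not stated here. The helper name `binomial_deviance` and the clipping constant `eps` are illustrative choices.

```python
import math

def binomial_deviance(y, p, eps=1e-12):
    """-2 * Bernoulli log-likelihood of 0/1 observations y under probabilities p."""
    total = 0.0
    for yi, pi in zip(y, p):
        pi = min(max(pi, eps), 1.0 - eps)  # clip to avoid log(0)
        total += yi * math.log(pi) + (1 - yi) * math.log(1.0 - pi)
    return -2.0 * total

# Made-up example: sharper correct predictions give lower deviance.
y_obs = [1, 0, 1, 0]
dev_vague = binomial_deviance(y_obs, [0.5, 0.5, 0.5, 0.5])
dev_sharp = binomial_deviance(y_obs, [0.9, 0.1, 0.8, 0.2])
```

Unlike Tjur's \(R^2\), deviance heavily penalizes confident wrong predictions, so the two summaries can rank models differently.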

summary
(species richness: MSE per site)
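A sketch of the richness comparison, assuming "MSE per site" means the squared difference between observed and predicted species richness, averaged over sites, with predicted richness taken as the sum of predicted occurrence probabilities. The helper `richness_mse` and the numbers are hypothetical.

```python
def richness_mse(observed, predicted):
    """Mean over sites of (observed richness - predicted richness)^2.

    observed:  per-site rows of 0/1 occurrences, one entry per species.
    predicted: per-site rows of occurrence probabilities, same shape.
    """
    squared_errors = [
        (sum(obs_row) - sum(pred_row)) ** 2
        for obs_row, pred_row in zip(observed, predicted)
    ]
    return sum(squared_errors) / len(squared_errors)

# Made-up example: 2 sites, 3 species.
obs = [[1, 0, 1],
       [0, 0, 1]]
pred = [[0.5, 0.5, 0.5],
        [0.25, 0.25, 0.5]]
mse = richness_mse(obs, pred)
```

Here the first site has observed richness 2 against a prediction of 1.5, and the second is predicted exactly, so the result is 0.125.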

timings

discussion

caveats

  • work in progress
  • methodological attribution fallacy
  • algorithmic methods handicapped?
  • fundamental vs. realized niche
  • desirable extensions:
    dispersal, phylogenetic structure, trophic interactions

phenomenological/mechanistic

  • context-dependent
  • many mechanisms per functional form:
    e.g. Michaelis-Menten/Monod/Beverton-Holt
  • semimechanistic models
    (Wood (2001); e.g. Merow et al. (2011))
  • linearizable models (e.g. Dennis et al. (2006))
  • prediction vs. estimation …
    (Peters 1991; Breiman 2001)
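To make the "many mechanisms per functional form" point concrete, the three named models can be written in their standard textbook forms (the symbols below are the conventional ones, not taken from the talk):

\[
\text{Michaelis-Menten: } v = \frac{V_{\max} S}{K_m + S},
\qquad
\text{Monod: } \mu = \frac{\mu_{\max} S}{K_S + S},
\qquad
\text{Beverton-Holt: } R = \frac{a S}{1 + b S} = \frac{(a/b)\, S}{(1/b) + S}
\]

All three are the same saturating hyperbola \(y = \alpha x / (\kappa + x)\), reached from enzyme kinetics, microbial growth and fish recruitment respectively, so a good fit of this curve cannot by itself identify the mechanism. The same curve also illustrates what "linearizable" can mean, since taking reciprocals gives a straight line in \(1/x\):

\[
\frac{1}{y} = \frac{1}{\alpha} + \frac{\kappa}{\alpha} \cdot \frac{1}{x}
\]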

the way forward

  • tradeoffs/constraints
    • identifiability
    • computational efficiency
    • interpretability
    • flexibility
  • ecological-mechanistic \(\leftrightarrow\)
    model-based-statistical (nonlinear) \(\leftrightarrow\) model-based-statistical (linear) \(\leftrightarrow\)
    algorithmic-statistical

References